
Regular Polysemy

A body of work, beginning most prominently with Apresjan (1974), concentrates on identifying regular shifts in meaning that can occur with particular classes of words. Apresjan (1974:16) defines regular polysemy as follows:

Polysemy of the word A with the meanings $a_i$ and $a_j$ is called regular if, in the given language, there exists at least one other word B with the meanings $b_i$ and $b_j$, which are semantically distinguished from each other in exactly the same way as $a_i$ and $a_j$ and if $a_i$ and $b_i$, $a_j$ and $b_j$ are nonsynonymous.
He suggests that many types of regular polysemy are productive: if a word has a meaning of a particular type (e.g. the type of $a_i$ and $b_i$ above), it can also be used with a meaning of another type (e.g. the type of $a_j$ and $b_j$). From this observation developed the idea of capturing this kind of polysemy in terms of statements (rules) expressing the relationship between semantic types.

Taking as a starting point the systematic relationships identified by Apresjan (1974) and others discussed by Miller (1978), Clark & Clark (1979), Aronoff (1980), and Lehrer (1990), Ostler & Atkins (1991) argue that knowledge of such lexical semantic relationships forms part of linguistic knowledge, and cannot simply be a reflection of regularities in the world or ``speakers' free play with analogy'' (p. 77), because lexical factors interact with the application of the relationships (e.g. blocking). They argue for explicit representation of the relationships in terms of Lexical Implication Rules (LIRs), which generate derived lexical entries from base lexical entries. In this sort of rule we find the starting point for a dynamic view of the (computational) lexicon: by making generalisations about the potential alternations which a class of words can undergo, one aspect of creativity in language use can be accounted for. For example, if a system captures a rule such as ``a countable noun which refers to an animal can be used as an uncountable noun which refers to the meat from that animal'', it would have a basis for distinguishing the senses of dog in John walked his dog and We ate dog for dinner last night, and would further allow for novel instances of this animal → meat rule, such as I tasted aardvark yesterday. This sense of aardvark cannot be considered established and a part of the fixed lexicon, but it can still be interpreted given knowledge of the lexical generalisation.
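The animal-to-meat generalisation quoted above can be sketched as a function from a base entry to a derived entry. This is only a minimal illustration, not the LIR formalism itself: the `LexicalEntry` fields, the class labels, and the function name `animal_to_meat` are all assumptions introduced here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LexicalEntry:
    form: str                          # orthographic word form
    semantic_class: str                # illustrative label, e.g. "animal" or "meat"
    countable: bool                    # True for count nouns, False for mass nouns
    derived_by: Optional[str] = None   # name of the rule that produced the entry


def animal_to_meat(entry: LexicalEntry) -> Optional[LexicalEntry]:
    """Hypothetical rule: countable animal noun -> uncountable meat noun.

    Returns None when the entry does not satisfy the input condition.
    """
    if entry.semantic_class != "animal" or not entry.countable:
        return None
    return replace(entry, semantic_class="meat", countable=False,
                   derived_by="animal_to_meat")


dog = LexicalEntry("dog", "animal", countable=True)
aardvark = LexicalEntry("aardvark", "animal", countable=True)
table = LexicalEntry("table", "furniture", countable=True)

print(animal_to_meat(aardvark))  # a derived, non-established meat sense
print(animal_to_meat(table))     # the rule does not apply
```

The derived entry for aardvark is never stored in the fixed lexicon; it is produced on demand, which is the sense in which such a rule makes the lexicon dynamic.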

The use of lexical rules depends on a structured lexicon. That is, it depends on a lexicon in which various generalisations are captured about the semantic classes which words in the lexicon belong to (e.g. animal nouns). This stems from the need to constrain the input to the lexical rules, to restrict their application to those words which participate in the regular relationships. It is difficult to imagine a semantically motivated way of defining the appropriate input to a rule in the absence of represented semantic relationships between words. Lexical rules would be extremely difficult to formulate under an unstructured multiple lexical entry view of the lexicon, in which entries are not grouped into classes.
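As a rough illustration of why a structured lexicon matters, the sketch below restricts the input of a rule by walking a small class hierarchy. The hierarchy, the class labels, and the helper names `belongs_to` and `rule_inputs` are assumptions made for this example only.

```python
from typing import Dict, List, Optional

# Toy semantic hierarchy: class -> parent class (None at the root).
CLASS_PARENT: Dict[str, Optional[str]] = {
    "physical_object": None,
    "animal": "physical_object",
    "mammal": "animal",
    "bird": "animal",
    "plant": "physical_object",
    "tree": "plant",
}

# Toy lexicon: word form -> most specific semantic class.
WORD_CLASS: Dict[str, str] = {
    "dog": "mammal",
    "aardvark": "mammal",
    "turkey": "bird",
    "oak": "tree",
    "teak": "tree",
}


def belongs_to(word: str, target_class: str) -> bool:
    """True if the word's class is target_class or one of its descendants."""
    node = WORD_CLASS.get(word)
    while node is not None:
        if node == target_class:
            return True
        node = CLASS_PARENT.get(node)
    return False


def rule_inputs(target_class: str) -> List[str]:
    """All words that a rule restricted to target_class may apply to."""
    return sorted(w for w in WORD_CLASS if belongs_to(w, target_class))


print(rule_inputs("animal"))
print(rule_inputs("tree"))
```

Without the class information in `WORD_CLASS` and `CLASS_PARENT`, the only way to state the input of a rule would be to list every participating word individually.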

One of the main issues involved in using lexical rules such as these in a computational framework is how to avoid the spurious ambiguities which the rules might generate. I hinted above at the issue of blocking, which occurs when the application of a rule is blocked because a word already exists in the lexicon in the place of the potential output. Two kinds of blocking are semantic pre-emption, where the sense to be generated is already expressed by an existing word (consider beef, which blocks the generation of cow with a meat sense), and lexical pre-emption, where the word form to be generated already exists in the lexicon with a different sense. These phenomena led Ostler & Atkins (1991) to argue that LIRs must have clear directionality and that blocking phenomena can be formulated as constraints on the application of the LIRs.
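The two kinds of blocking can be sketched as a check run before a rule's output is accepted. This is a simplified illustration under stated assumptions: the tables `ESTABLISHED` and `LEXICALISED_BY`, the sense labels, and the function `blocking_status` are hypothetical, and the cook/cooker pair is an invented example of lexical pre-emption that does not come from the works cited.

```python
from typing import Dict, Optional, Set, Tuple

# Fixed lexicon: word form -> established senses (toy data).
ESTABLISHED: Dict[str, Set[str]] = {
    "cow": {"animal"},
    "beef": {"meat"},
    "dog": {"animal"},
    "cook": {"prepare_food"},
    "cooker": {"appliance"},
}

# (base form, derived sense) -> existing word that already expresses that sense.
LEXICALISED_BY: Dict[Tuple[str, str], str] = {
    ("cow", "meat"): "beef",
}


def blocking_status(base_form: str, output_form: str,
                    derived_sense: str) -> Optional[str]:
    """Return the kind of blocking that applies, or None if the rule may fire."""
    # Semantic pre-emption: another word already carries the derived sense.
    if (base_form, derived_sense) in LEXICALISED_BY:
        return "semantic pre-emption"
    # Lexical pre-emption: the output form exists with a different sense.
    # Rules that leave the form unchanged are exempt, since their base form
    # trivially exists with another sense.
    existing = ESTABLISHED.get(output_form, set())
    if output_form != base_form and existing and derived_sense not in existing:
        return "lexical pre-emption"
    return None


print(blocking_status("cow", "cow", "meat"))
print(blocking_status("dog", "dog", "meat"))
print(blocking_status("cook", "cooker", "person_who_cooks"))
```

Treating blocking as a hard constraint like this is the simplest option; it rules out the derived form entirely whenever a pre-empting entry is found.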

The assumption that lexical relationships are directional has, however, been challenged by, among others, Nunberg (1978) and Bredenkamp (1996), and this challenge is implicit in the bi-directional representation of lexical rules utilised by Pinker (1989). One central question which arises if directionality of lexical rules is assumed is how to determine which of the senses is the base sense and which is the derived sense. As pointed out by Kilgarriff (1992:89-90), there are cases in which one sense of a systematic relationship is most salient for certain instantiations of it, while the other sense is more salient for other instantiations. An example is found in the tree/wood relationship: the name of a tree can be used to refer to its wood. For oak, ash, and many other trees/woods, the tree sense is most salient. However, for teak and mahogany, the wood sense seems to be most salient. Similarly, for turkey, the meat sense is more salient than the animal sense, while the reverse is true for dog. Kilgarriff addresses this problem by expressing the alternation both at the node for TREE (relating it to WOOD) in his ontology, and at the node for WOOD (relating it to TREE). This is surely unnecessary redundancy. So how, then, could an NLP system instead deal with the problem of blocking?

Copestake & Briscoe (1995) and Copestake (1995) propose adding to lexical entries conditional probabilities that reflect how likely a word is to be used in a specific sense. Thus although a lexical rule will generate a ``derived'' sense from a ``base'' sense, different frequencies may be associated with the two senses, and for some word forms (e.g. teak) the derived sense will have a higher frequency than the base sense. Under such proposals, novel usages of a word form can be derived through productive application of a lexical rule, but the NLP system will have a measure reflecting word usage. Established word senses will be associated with high frequencies, while non-established senses will have low frequencies (and should thus be avoided in, for example, an NLG system). Frequency information could be used to guide parsing, through preference for high-frequency senses. Ambiguous word forms would be parsed initially using the high-frequency sense; low-frequency senses should only be chosen in the case of a syntactic conflict or as a result of subsequent pragmatic processing which determines that a low-frequency sense is the ``correct'' one.
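A minimal sketch of the idea of attaching conditional sense probabilities to word forms is given below. The counts are invented purely for illustration and are not corpus data, and the relative-frequency estimate and the names `sense_probabilities` and `ranked_senses` are assumptions of this example rather than the estimation method of the proposals discussed.

```python
from typing import Dict, List, Tuple

# Invented sense counts per word form (NOT real corpus figures).
SENSE_COUNTS: Dict[str, Dict[str, int]] = {
    "oak": {"tree": 80, "wood": 20},
    "teak": {"tree": 5, "wood": 95},
    "dog": {"animal": 999, "meat": 1},
}


def sense_probabilities(form: str) -> Dict[str, float]:
    """Estimate P(sense | form) as a relative frequency."""
    counts = SENSE_COUNTS[form]
    total = sum(counts.values())
    return {sense: count / total for sense, count in counts.items()}


def ranked_senses(form: str) -> List[Tuple[str, float]]:
    """Senses ordered from most to least probable, for use as a preference."""
    probs = sense_probabilities(form)
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)


for form in ("oak", "teak", "dog"):
    print(form, ranked_senses(form))
```

With these toy figures the rule-derived wood sense outranks the base tree sense for teak, while the reverse holds for oak, so the ordering of senses no longer depends on which one the rule treats as the base.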

Furthermore, there are many exceptions to the ``constraints'' imposed on such lexical rules through blocking (cf. We had mad cow for dinner last night) and so application of the rules should not be prevented altogether, even when there are lexical items which might be considered to pre-empt the derived form. Rather, the frequency information would give a clue about how common a particular derived form is, and would obviate the need for hard constraints. It would also guide interpretation, in that the use of an extremely low-frequency form very likely indicates that the speaker chose the word carefully, and wants to convey something additional at the pragmatic level (such as distaste for the meat of a cow). This could be used to trigger pragmatic processing.
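The soft alternative to hard blocking described above might look roughly as follows: the interpreter prefers the most probable sense that the context allows and flags very rare choices for pragmatic processing. The probabilities, the threshold value, and the function `choose_sense` are illustrative assumptions, not values or interfaces from the cited proposals.

```python
from typing import Dict, Set, Tuple

# Invented P(sense | form) values (NOT real corpus estimates).
SENSE_PROB: Dict[str, Dict[str, float]] = {
    "cow": {"animal": 0.999, "meat": 0.001},
    "turkey": {"animal": 0.3, "meat": 0.7},
}

# Hypothetical cut-off below which a chosen sense counts as marked.
MARKED_THRESHOLD = 0.01


def choose_sense(form: str, compatible: Set[str]) -> Tuple[str, bool]:
    """Pick the most probable sense compatible with the context.

    Returns the sense and a flag saying whether pragmatic processing
    should be triggered because the chosen sense is very rare.
    """
    candidates = {s: p for s, p in SENSE_PROB[form].items() if s in compatible}
    if not candidates:
        raise ValueError(f"no compatible sense for {form!r}")
    sense = max(candidates, key=candidates.get)
    return sense, candidates[sense] < MARKED_THRESHOLD


# "We had mad cow for dinner": the mass-noun context only allows the meat sense.
print(choose_sense("cow", {"meat"}))
# No syntactic conflict: the high-frequency sense is preferred.
print(choose_sense("cow", {"animal", "meat"}))
print(choose_sense("turkey", {"meat"}))
```

Nothing is ever ruled out here; a pre-empted form such as the meat sense of cow remains available, and its low probability is what signals that the speaker may intend something extra.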

In sum, the use of lexical rules in NLP systems can increase the flexibility of the lexicon in those systems. Lexical rules define the space of regular, productive sense extensions which can be used to generate new senses from existing ones. The addition of frequency-based sense probabilities to the lexicon can help prevent non-established senses from being used in generation, and can influence the interpretation process.

